What Yelp Fake Review Filter Might Be Doing?
Authors
Abstract
Online reviews have become a valuable resource for decision making. However, their usefulness brings forth a curse: deceptive opinion spam. In recent years, fake review detection has attracted significant attention, yet most review sites still do not publicly filter fake reviews. Yelp is an exception, having filtered reviews for the past few years, but its algorithm is a trade secret. In this work, we attempt to find out what Yelp might be doing by analyzing its filtered reviews. The results will be useful to other review hosting sites in their filtering efforts. There are two main approaches to filtering: supervised and unsupervised learning. In terms of features used, there are also roughly two types: linguistic features and behavioral features. We take a supervised approach, as we can use Yelp’s filtered reviews for training. Existing supervised approaches all rely on pseudo fake reviews rather than fake reviews filtered by a commercial Web site. Recently, supervised learning using linguistic n-gram features has been shown to perform extremely well (attaining around 90% accuracy) in detecting crowdsourced fake reviews generated using Amazon Mechanical Turk (AMT). We put these existing research methods to the test and evaluate their performance on the real-life Yelp data. To our surprise, the behavioral features perform very well, but the linguistic features are not as effective. To investigate, we propose a novel information theoretic analysis to uncover the precise psycholinguistic difference between AMT reviews and Yelp reviews (crowdsourced vs. commercial fake reviews). We find something quite interesting. This analysis and the experimental results allow us to postulate that Yelp’s filtering is reasonable and that its filtering algorithm seems to be correlated with abnormal spamming behaviors.
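The supervised, n-gram-based setup the abstract refers to can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not name a classifier, so a multinomial naive Bayes model with add-one smoothing is swapped in here because it needs only the standard library. The function `ngrams`, the class `NaiveBayes`, the labels `filtered` and `unfiltered`, and the four toy reviews are all hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Return word n-grams (n = 1..n_max) of a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats


class NaiveBayes:
    """Multinomial naive Bayes over n-gram counts (illustrative stand-in classifier)."""

    def fit(self, texts, labels):
        self.counts = {}                    # label -> Counter of n-gram counts
        self.doc_counts = Counter(labels)   # label -> number of training reviews
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = ngrams(text)
            self.counts.setdefault(label, Counter()).update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counter in self.counts.items():
            # Log prior plus add-one-smoothed log likelihood of each known n-gram.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counter.values()) + len(self.vocab)
            for feat in ngrams(text):
                if feat in self.vocab:      # n-grams unseen in training are skipped
                    score += math.log((counter[feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data; real experiments would train on labeled review corpora.
train_texts = [
    "best place ever amazing amazing food",
    "amazing best deal ever must visit",
    "the soup was cold and service slow",
    "decent tacos but parking is hard",
]
train_labels = ["filtered", "filtered", "unfiltered", "unfiltered"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("amazing best food ever"))
print(model.predict("the service was slow"))
```

A classifier like this sees only the wording of a review. The behavioral features the abstract contrasts with linguistic ones would have to come from reviewer activity data, which this sketch does not model.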
Similar resources
Understanding the Yelp review filter: An exploratory study
Reviews on Yelp.com can be an important factor in driving customers to a business. However, many business owners have expressed concern with Yelp's review filtering system, which was created to flag low-quality or fake reviews. This study performs a content analysis of a subset of Yelp restaurant and religious organization reviews, visible and filtered, exploring signals from the reviews or the...
Fake It Till You Make It: Reputation, Competition, and Yelp Review Fraud
Consumer reviews are now part of everyday decision-making. Yet, the credibility of these reviews is fundamentally undermined when businesses commit review fraud, creating fake reviews for themselves or their competitors. We investigate the economic incentives to commit review fraud on the popular review platform Yelp, using two complementary approaches and datasets. We begin by analyzing restau...
Poster: Spotting Suspicious Reviews via (Quasi-)clique Extraction
How to tell if a review is real or fake? What does the underworld of fraudulent reviewing look like? Detecting suspicious reviews has become a major issue for many online services. We propose the use of a clique-finding approach to discover well-organized suspicious reviewers. From a Yelp dataset with over one million reviews, we construct multiple Reviewer Similarity graphs to link users that ...
Spotting Suspicious Reviews via (Quasi-)clique Extraction
How to tell if a review is real or fake? What does the underworld of fraudulent reviewing look like? Detecting suspicious reviews has become a major issue for many online services. We propose the use of a clique-finding approach to discover well-organized suspicious reviewers. From a Yelp dataset with over one million reviews, we construct multiple Reviewer Similarity graphs to link users that ...
Smoke Screener or Straight Shooter: Detecting Elite Sybil Attacks in User-Review Social Networks
Popular User-Review Social Networks (URSNs)— such as Dianping, Yelp, and Amazon—are often the targets of reputation attacks in which fake reviews are posted in order to boost or diminish the ratings of listed products and services. These attacks often emanate from a collection of accounts, called Sybils, which are collectively managed by a group of real users. A new advanced scheme, which we te...
Publication date: 2013